home *** CD-ROM | disk | FTP | other *** search
- Path: keats.ugrad.cs.ubc.ca!not-for-mail
- From: c2a192@ugrad.cs.ubc.ca (Kazimir Kylheku)
- Newsgroups: comp.lang.c
- Subject: Re: unique id for a string
- Date: 21 Feb 1996 10:57:50 -0800
- Organization: Computer Science, University of B.C., Vancouver, B.C., Canada
- Message-ID: <4gfpveINNe68@keats.ugrad.cs.ubc.ca>
- References: <1996Feb16.175601.114182@kuhub.cc.ukans.edu> <3129dd39.961626@news.iquest.net>
- NNTP-Posting-Host: keats.ugrad.cs.ubc.ca
-
- In article <3129dd39.961626@news.iquest.net>,
- Robert B. Clark <rclark@iquest.net> wrote:
- >On 16 Feb 96 17:56:01 CST, anh@kuhub.cc.ukans.edu wrote:
- >
- >>HELLO = H * 26^4 + E * 26 ^3 + L * 26^2 + L * 26^1 + E * 26^0
- >>
- >>But the id is way too big. I need something that fits within a 32-bits int
- >>or a 64-bits long.
- >
- >You're thinking way too hard, perhaps. Look up the algorithm for CRCs
- >(cyclic redundancy checks/codes) in Snippets or most any other popular
- >programmer's reference.
-
- But he wants a _unique_ ID. This is not possible, since the space of possible
- words is much larger than the space of 32- or even 64-bit integers. Even for a
- limited dictionary of around 200,000 words, a function that _guarantees_ a
- unique hash for each indentifier into a 32-bit space is difficult to find.
-
- The problem is that the poster did not give enough context about the actual
- problem he is trying to solve (unless this is just a homework exercise to find
- such a mapping). Perhaps the problem can be effectively solved without having
- to compute such a mapping, which is probably of little worth.
-
- One way to get a unique mapping is to simply assign increasing numbers to the
- 200,000+ words, and use a hashing technique to quickly look up a word given its
- integer tag, and look up the tag given the word. With the proper
- represenatation, both operations can be done in "constant time".
- --
-
-